Correcting the reproduction number for time-varying tests: A proposal and an application to COVID-19 in France

We provide a novel way to correct the effective reproduction number for the time-varying amount of tests, using the acceleration index (Baunez et al., 2021) as a simple measure of viral spread dynamics. Without this correction, the reproduction number is a biased estimate of viral acceleration, and we provide a formal decomposition of the resulting bias, involving the useful notions of test and infectivity intensities. When applied to French data for the COVID-19 pandemic (May 13, 2020 to October 26, 2022), our decomposition shows that the reproduction number, when considered alone, characteristically underestimates the resurgence of the pandemic, compared to the acceleration index, which accounts for the time-varying volume of tests. Because the acceleration index aggregates all relevant information and captures in real time the sizable time variation featured by viral circulation, it is a more parsimonious indicator to track the dynamics of an infectious disease outbreak in real time, compared to the equivalent alternative, which would combine the reproduction number with the test and infectivity intensities.


Introduction
The reproduction number is a widely used measure of how fast a pathogen propagates both at the outset and during an infectious disease outbreak (see for example [1]). One of its major shortcomings, however, is that it does not control for the quantity of tests (or any diagnostics) performed in real time. Doing so is of crucial importance for two reasons. One is the fact that accurate empirical estimates of reproduction numbers are time-varying in nature (see e.g. [2] among many others). A considerable source of time variation comes from the fact that the amount of tests changes substantially across time and hence affects the number of known cases, due to demand and supply effects. Second, testing acts as a magnifying lens on viral activity at least on the part of the population that has effectively been tested. The reproduction number however does not rely on that information but rather makes assumptions on infectivity, based on the observation of onset of symptoms and transmission in closed systems such as households (see e.g. [3]). But this information, again, depends on tests (or any diagnostics more generally). Hence inferring infectivity from (assumed) transmissions is only secondary information based on the availability of tests. Soon after the onset of COVID-19, it was possible to diagnose people using PCR tests. From a public health perspective, this is of course a much more favorable situation, compared to other infectious diseases for which biological tests are either nonexistent or available much later after the disease has been discovered. Widespread testing allows early care and treatment whenever diagnostics are performed. In addition, it is a rather trivial observation that whenever information about how many tests are performed in a given period is available, that information should be used to assess the dynamics of viral spread. 
After all, positive and negative cases do not fall from heaven; quite to the contrary, they become visible only through the lens of testing. Considering only positive cases but ignoring tests means ignoring also how many negative cases are out there, which is the flip side of the disease and informs about how many people are not infected in a given population. The latter is a relevant and useful piece of information since incidence of the infectious disease should ideally be measured against the tested population, not the entire population which includes people with unknown health status. Given that it is hardly possible to think of reasons that would justify ignoring deliberately such data about the extent of testing, the question then becomes how to incorporate that bit of evidence into any indicator that aims at tracking the dynamics of viral spread. This is the core question that we address in this paper.
In [4][5][6][7], we have introduced an alternative and novel measure of viral spread in the context of COVID-19: the acceleration index. This measure considers the variation of cases relative to the variation of tests and thus avoids the shortcomings and addresses the important question mentioned above. The purpose of this article is to discuss the reproduction number in the light of our acceleration index, and to show that the former is actually a special case of the latter and in fact a less accurate metric of the pandemic's time-varying spread.
We examine this important issue in two steps. In the Materials and methods section, we start from the very definition of the reproduction number as a gross rate of growth of infected people, traditionally denoted R, and derive a general formula that connects it to our acceleration index, which we denote ε. The acceleration index is an elasticity that measures the proportional responsiveness of cases to tests, and it can also be thought of as the ratio between the current and average viral speeds. More specifically, we present an explicit measure of the ratio between R and ε, the interpretation of which is further discussed in terms of the infectivity and test intensities. Our theoretical inquiry stresses that while the acceleration index is a ratio of growth rates (that of cases divided by that of tests), the reproduction number tracks only the growth rate of cases. In other words, the acceleration index corrects the reproduction number for the time-varying amount of tests. Not doing so results in the reproduction number being a biased estimate of viral acceleration, and we provide a formal decomposition of the resulting bias. The main conclusion we derive is that the reproduction number tends to overestimate (respectively underestimate) the dynamics of viral spread compared to the acceleration index when the amount of tests is large enough (respectively small enough).
In the Results and discussion section, we apply such an analysis to France, using an exhaustive dataset covering May 13, 2020, to October 26, 2022, and including pre- and post-vaccination periods. We show that there is a sizeable difference between both measures. Indeed, the reproduction number R largely underestimates the spread of the virus, compared to our test-controlled measure of viral acceleration. This discrepancy is particularly severe for the epidemic wave due to the Omicron strains during autumn 2022. In S1 Appendix, we provide a similar analysis for five other countries to show that this result is not limited to the French case. It is in this sense that we say that the reproduction number is biased when tests are time-varying. This obviously has important consequences if the reproduction number is used as the basis for public health decisions such as entering or exiting a lockdown. We also look at the effects of the second lockdown period in France, which started October 30, 2020, through the lens of both indicators, as a further example that illustrates the bias unavoidably implied by not adjusting for the volume of tests over time when measuring the pandemic's acceleration.
A key conclusion follows from our theoretical and empirical results. If public health authorities aim at measuring as accurately as possible viral acceleration, they have to rely on one of the following strategies: track in real time either the acceleration index alone, or a combination of the reproduction number together with the test and infectivity intensities. Although both strategies are formally equivalent, the latter is not only less parsimonious, it is also arguably more delicate to operate in practice since one would then like to control the bias that inevitably comes from time-varying tests. This is one of the main reasons why we argue in favor of using the acceleration index.

Materials and methods
About a century ago, a series of seminal articles [8][9][10] laid the foundations for a mathematical theory of epidemics. More specifically, their compartmental (that is, Susceptible, Infected and Removed or SIR-type) and time-since-infection models have been extensively used and refined in the academic literature about infectious and emerging diseases. A core concept in this paradigm is the reproduction number, usually denoted R, which roughly captures how many secondary cases originate, on average, from a pool of primary cases that are still infectious (see again [1]).
As evident from publications by health agencies around the world since the onset of COVID-19, much of the guidance for designing policy measures to curb the pandemic relies prominently on estimates of R, among other things. The reproduction number is initially a theoretical concept, conceived to understand the transmissibility of an epidemic. Much effort has gone into defining ways to estimate it empirically. Broadly speaking, estimation strategies fall into two categories. The first one rests on the basic SIR model (see e.g. [11] for a clear exposition), which predicts that the reproduction number R is the product of four parameters: the duration of infection, the number of contacts per case, the fraction of contacts who are in turn infected, on average, and finally the fraction of total population susceptible to infection. Although each of these parameters could be estimated in real time, this turns out to be a gigantic task, in particular when a novel pathogen like SARS-CoV-2 emerges. A shortcut to avoid such a demanding procedure is to fit a SIR model using the number of cases, so as to estimate R directly, given the infection duration (see, among many others, [12] for a recent example related to Ebola using maximum likelihood estimation). This is feasible, even in real time, provided that enough data is available to ensure precision and that structural assumptions about the time dependency of R are made. A caveat, though, is that such fitting procedures have limitations (see e.g. [13]). An additional issue arising from estimation based on compartmental models is the sizeable range of estimates (see [14] for SARS, and [15] for the early stages of COVID-19). The second estimation strategy addresses more directly the time-varying dimension of R, which is more in line with epidemiological and clinical data. Many health agencies rely on such estimates from time-since-infection transmission models rather than SIR-type models.
Here the basic idea is that R is essentially one plus the growth rate of infected persons, that is, the ratio between the number of new (that is, secondary) cases arising, say, within 24 hours, and the number of primary cases (see again [2]). For example, the French agency in charge of health statistics uses the Cori method, after [3] (see https://www.santepubliquefrance.fr/content/download/266456/2671953). Other European health agencies are also using this method, e.g. Austria (see https://www.ages.at/download/0/0/e03842347d92e5922e76993df9ac8e9b28635caa/fileadmin/AGES2015/Wissen-Aktuell/COVID19/Methoden_zur_Schatzung_der_epi_Parameter.pdf) and Germany (see https://www.rki.de/DE/Content/InfAZ/N/Neuartiges_Coronavirus/Projekte_RKI/R-Wert-Erlaeuterung.pdf?__blob=publicationFile).
Our task here is to relate the acceleration index defined in [5,6] and the reproduction number that is estimated using the time-since-infection approach just described. The main purpose of this section is to derive a theoretical relationship between both concepts, which helps to explain why they are different, to give a sense of the magnitude of their difference, and to state the conditions under which they are equivalent. We then turn, in the next section, to data to gauge whether the difference between the two matters for tracking the COVID-19 pandemic.
Suppose that data is available about the number of tested and positive persons, up to end date $T$. Denote by $\{p_1, \ldots, p_T\}$ the historical time series of the new (per period) number of positive persons from date $t = 1$ to end date $t = T$. Similarly, $\{d_1, \ldots, d_T\}$ is the historical time series of new (per period) diagnosed/tested persons. Denote by $P_t = \sum_{\tau=1}^{t} p_\tau$ and $D_t = \sum_{\tau=1}^{t} d_\tau$ the cumulative numbers of positive and diagnosed persons up to date $t$.
As stressed in [6], accurate information about the dynamics of a pandemic rests on both the number of cases and the number of tests, and the former cannot be properly understood without the latter. In that paper, we introduce an acceleration index, denoted $\varepsilon_T$ at date $T$, which is an elasticity that measures the proportional responsiveness of cases with respect to tests. Given that the number of cases and tests are not necessarily varying at the same rate across time, groups and also space, the acceleration index measures the percentage change of cases with respect to a percentage change of tests and is thus unit-free. The acceleration index is defined as follows:

$$\varepsilon_T = \frac{(P_T - P_{T-1})/P_T}{(D_T - D_{T-1})/D_T} \tag{1}$$

Rearranging the terms of the latter equation, we see that the acceleration index relates to the daily and average positivity rates, in the following way:

$$\varepsilon_T = \underbrace{\frac{P_T - P_{T-1}}{D_T - D_{T-1}}}_{\text{daily positivity rate}} \Bigg/ \underbrace{\frac{P_T}{D_T}}_{\text{average positivity rate}} = \frac{p_T/d_T}{P_T/D_T} \tag{2}$$

Eq (2) shows that the acceleration index is an elasticity, a concept widely used by economists since [16] to study the responsiveness of demand with respect to a change in the price of a good. More precisely, the acceleration index is an elasticity that relates cumulated stocks (of cases to tests) over possibly extended periods if the epidemic lasts long. Mathematically speaking, such an elasticity measures the convexity of the relationship between cumulated cases and cumulated tests, using the non-local property that the linear approximation of any convex function provides a lower bound for that function at any point. In contrast, the second derivative is a local measure of convexity. In addition, an important reason why we call such an elasticity an index of viral acceleration can be clarified using an analogy with linear body motions. Given that our analysis relies on data about cases and tests only, with the latter as units of measurement, one can think of the acceleration index as the ratio between current and average viral speed.
With tests as the unit of measurement, the daily positivity rate $p_T/d_T$ becomes a measure of current viral speed at date $T$, that is, the fraction of tested people that turn out to be positive on that day. The average positivity rate $P_T/D_T$ at date $T$ can be thought of as average viral speed, taken over the entire data sample. In S1 Appendix we illustrate through an example why we do not average over daily positivity rates in the usual way, but rather take the ratio of cumulated cases to cumulated tests. If current viral speed is larger than average viral speed, we are in a situation of viral acceleration and the pandemic is on the loose. In that case, our acceleration index $\varepsilon_T$ is larger than one, which means that increasing tests by 1% leads to more than 1% of new cases. An arguably legitimate goal of public health policy would therefore be to make sure that the acceleration index gets smaller than one, i.e. that current viral speed becomes smaller than average viral speed: this would indicate that the pandemic decelerates and comes under control. Ideally, one would like to find ever fewer cases the more one tests. This reasoning also shows why it is not sufficient to look at positivity rates alone: they only indicate viral speed. What matters for public health is to understand whether speed becomes greater or smaller compared to its historical average as tests increase, which is what our acceleration index measures. In S1 Appendix, we give an example using exponential growth, for which closed-form solutions are derived and can be used to further illustrate the interpretation of the acceleration index as a unit-free elasticity that relates to how the current viral speed compares to its historical average.
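As an illustration of the ratio of daily to average positivity rates, the following minimal Python sketch computes the acceleration index date by date from two daily series. The function name `acceleration_index` and the toy numbers are hypothetical and serve illustration only; they are not the data or code used in this paper.

```python
from itertools import accumulate


def acceleration_index(cases, tests):
    """Acceleration index at each date: daily over average positivity rate.

    cases[t] and tests[t] are the new positives and new tests of period t.
    """
    cum_cases = list(accumulate(cases))  # P_t, cumulated cases
    cum_tests = list(accumulate(tests))  # D_t, cumulated tests
    index = []
    for p, d, big_p, big_d in zip(cases, tests, cum_cases, cum_tests):
        if d == 0 or big_p == 0:
            index.append(float("nan"))  # undefined without tests or cases
        else:
            index.append((p / d) / (big_p / big_d))
    return index


# Hypothetical toy series (not real data): cases double on the last day
# while daily tests stay flat, so current speed exceeds average speed.
toy_cases = [10, 10, 20]
toy_tests = [100, 100, 100]
toy_index = acceleration_index(toy_cases, toy_tests)
```

In this toy series the index equals one on the first two days, when the daily and average positivity rates coincide, and exceeds one on the third day.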
Regarding the reproduction number, we make a rather general assumption, in accordance with the mathematical literature on epidemics, that the reproduction number is essentially a gross rate of growth and, as such, can be written at date $t$ as:

$$R_t = \frac{p_t}{f_t(p_t, p_{t-1}, \ldots, p_{t-n})} \tag{3}$$

where $f_t$ is a function of new cases from date $t$ to date $t - n$, which can be thought of as the infectious potential, that is, the average number of people who have been infected at $t$ and before, and who can infect people at $t$. The lag parameter $n$ is related to infection duration. The assumed time dependence of $f_t$ may capture many different phenomena that influence the number of cases, including for example health policy decisions but also the emergence of new strains of the virus. However, one specific factor that we have in mind here is the observation that the amount of performed tests is time-varying and so, therefore, is the number of cases. Several specifications for $f_t$ have been used in the literature. We focus in this paper on a specific method, captured by equation (9) on page 3 in [2], which defines the time-varying effective reproduction number as follows:

$$\hat{R}_t = \frac{p_t}{\sum_{j=0}^{n} w_j\, p_{t-j}} \tag{4}$$

where the weights $w_j$ capture the generation time distribution, with $\sum_{j=0}^{n} w_j = 1$. This means that the time-independent function $\hat{f}$ that follows from the denominator in Eq (4) is, in that case, assumed to be linear in the number of cases $p$ (which does not imply that $p_t$ is linear in time of course). Note that such an assumption implies that, given the reproduction number $\hat{R}$, the dynamics of new cases follow an auto-regressive process AR($n$). Although for the sake of presentation we focus on this specific method to estimate the effective reproduction number, our analysis extends to possible alternatives.
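To make the specific estimator concrete, here is a minimal Python sketch of a reproduction number defined as current cases over a weighted moving average of current and past cases, with weights summing to one. The function name and the toy inputs are hypothetical, and the sketch deliberately ignores the statistical machinery (smoothing, credible intervals) that agencies use in practice.

```python
def reproduction_number(cases, weights):
    """Current cases divided by a weighted moving average of cases.

    weights[j] multiplies cases[t - j]; the weights are assumed to sum to one
    and stand in for the generation time distribution.
    """
    n = len(weights) - 1
    estimates = []
    for t in range(len(cases)):
        if t < n:
            estimates.append(float("nan"))  # not enough history yet
            continue
        potential = sum(w * cases[t - j] for j, w in enumerate(weights))
        estimates.append(cases[t] / potential if potential > 0 else float("nan"))
    return estimates


# Hypothetical example: cases double each period and all the weight is put
# on the previous period, so the estimate is the gross growth rate of cases.
doubling = reproduction_number([1, 2, 4, 8], [0.0, 1.0])
```

Note that the number of tests appears nowhere in this computation, which is the point made in the text that follows.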
Even though it might go unnoticed at first sight, we should stress that a major difference between the rather general definition of $R$ in Eq (3) and the specific definition of $\hat{R}$ in Eq (4) is that the function $f_t$ in the former implicitly depends on calendar time $t$, whereas its counterpart $\hat{f}$ in the latter does not. That is, what the latter takes account of is simply the number of cases detected, but not the fact that those cases will depend on the diagnostic effort or number of tests that has been realized. The fact that the diagnostics dimension is largely ignored in the literature about SIR-type models surfaces, for instance, in [17], who relate the epidemic growth rate to incidence and generation time interval only. It seems reasonable to assume that infectious and emerging diseases involve a diversity of pathogens, which require a variety of technologies to diagnose. In the context of COVID-19, PCR and antigen testing is of course key. This difference in accounting for cases turns out to be important to understand the connection between the acceleration index and the reproduction number, as we now show.
Using Eqs (2) and (3), we can relate our acceleration index and the reproduction number in the following way, at end date $T$:

$$\frac{\varepsilon_T}{R_T} = \frac{f_T(\cdot) \big/ \left(\frac{1}{T} P_T\right)}{d_T \big/ \left(\frac{1}{T} D_T\right)} \equiv \frac{A_T}{B_T} \tag{5}$$

Eq (5) shows that the ratio between the acceleration index and the basic reproduction number can itself be decomposed into a ratio. The numerator of this latter ratio, $A$, can be thought of as the current infectivity intensity, that is, the average number of primary cases up to period $T$ who can originate infections in $T$, expressed as a fraction of the historical average of the number of persons who have been infected since the outset of the pandemic. The denominator, $B$, on the other hand, represents the number of tests in period $T$ compared to its historical average up to $T$, that is, the current test intensity. To sum up, the ratio of the acceleration index to the reproduction number is, in any period, the ratio of the infectivity intensity to the test intensity.
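The decomposition can be checked numerically. The sketch below is a minimal illustration that assumes the infectious potential is the weighted moving average of cases used in Eq (4); it computes the infectivity intensity A (infectious potential over average cases per period), the test intensity B (current tests over average tests per period), and verifies that the ratio of the acceleration index to the reproduction number equals A/B. Function name, weights and series are hypothetical.

```python
def intensities(cases, tests, weights):
    """Return (A, B, epsilon, r_hat) at the last date of the sample.

    Assumes the infectious potential is a weighted moving average of cases,
    with weights[j] applied to the cases observed j periods ago.
    """
    big_t = len(cases)                     # number of periods, T
    cum_cases, cum_tests = sum(cases), sum(tests)
    potential = sum(w * cases[big_t - 1 - j] for j, w in enumerate(weights))
    a = potential / (cum_cases / big_t)    # infectivity intensity
    b = tests[-1] / (cum_tests / big_t)    # test intensity
    eps = (cases[-1] / tests[-1]) / (cum_cases / cum_tests)
    r_hat = cases[-1] / potential
    return a, b, eps, r_hat


# Hypothetical toy series, not real data.
a, b, eps, r_hat = intensities(
    cases=[10, 12, 15, 20],
    tests=[100, 150, 120, 200],
    weights=[0.2, 0.5, 0.3],
)
gap = eps / r_hat - a / b  # zero up to floating-point rounding
```

In this toy case the test intensity exceeds the infectivity intensity, so the acceleration index lies below the reproduction number.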
From Eq (5) we see that both indicators are equal at all dates $t$, that is, $\varepsilon_t = R_t$, if and only if $A_t = B_t$, that is:

$$f_t(\cdot) = \frac{d_t}{D_t}\, P_t \tag{6}$$

Eq (6) is very important to conceptualize the core idea of this paper: in order to properly control for the (time-varying) volume of tests/diagnostics, one needs to use the appropriate function $f_t$, that is, one which depends on calendar time because tests do. Said differently, the function $f_t$ should be specified in such a way that it takes account of the fact that cases are produced by tests or any other diagnostics. The linear form with no time dependence which appears in the denominator of Eq (4) is therefore problematic, as it assumes away tests, which are however key to measure the pandemic's dynamics. In this sense, the acceleration index $\varepsilon$ nests the basic reproduction number $R$: if the function $f_t$ is specified as in Eq (6), $R$ is equivalent to $\varepsilon$ as it takes account of testing; in any other case, $\varepsilon$ is more general than $R$, as defined for example in Eq (4) that takes account of cases only. So as to elaborate more on why the acceleration index nests the reproduction number, in the sense that the former is a test-adjusted version of the latter, let us consider two hypothetical cases. The first case obtains when daily tests are constant at all dates, which implies that $d_t = \frac{1}{t} D_t$, i.e. $B_t = 1$, and that Eq (6) now reads as $f_t(\cdot) = \frac{1}{t} P_t$. This means that in the case of time-invariant tests, the acceleration index and the reproduction number coincide if and only if the numerator of the infectivity intensity $A_t$ is equal to the time average of all cases since the initialization date. This contrasts with the denominator in the expression of $\hat{R}$ in Eq (4): it defines a time-independent function $\hat{f}(\cdot) = \sum_{j=0}^{n} w_j\, p_{t-j}$ as a moving average of the number of cases over a time window of length $n + 1$, which implies that Eq (6) is then violated.
In the rather specific configuration in which tests are constant over time, one can see that the acceleration index and the reproduction number, as defined in Eq (4), might differ because of the time window over which cases are included in the definition of the indicator of viral spread. More precisely, $\hat{R}$ assumes a time lag that relates to the observed generation time of the disease and it is defined as the ratio of current cases over a weighted moving average. Since the latter closely tracks the short-run trend in the number of cases, $\hat{R}$ captures the proportional deviations from that short-run trend. In contrast, $\varepsilon$ relies on the entire history of the pandemic since the number of cases is divided by a much smoother and longer-term trend (technically, the cumulative moving average), thus capturing deviations from that longer-term trend. The latter aspect is arguably relevant for public health decisions, due to possible path dependence and regime shifts related, for example, to mitigation policies and other changing behaviors. In addition, cases rising fast might make it more probable that new strains of the virus emerge and revive the pandemic, thus creating a positive feedback loop. Another benefit of de-trending the daily positivity rate by the average positivity rate is to make $\varepsilon$ a unit-free measure of viral acceleration that is useful to compare groups (see for example Baunez et al. [18] on vaccine effectiveness).
A more realistic configuration in view of the COVID-19 pandemic, however, is when tests do vary over time. Suppose they do but that, unrealistically, the daily number of newly found cases is now constant over time. In that case, $\hat{R}_t = 1$ and the associated $\hat{A}_t = 1$. In such a situation, Eq (6) is again violated because $d_t \neq \frac{1}{t} D_t$, hence $B_t \neq 1$. Such a violation signals not only that the acceleration index and reproduction number do not coincide, but also that the latter is not an accurate indicator of viral spread when tests vary over time but cases hypothetically do not. Two situations can arise. Either the test intensity is larger than one, meaning that the current level of tests exceeds its historical average so that $d_t > \frac{1}{t} D_t$ (hence $B_t > 1$). The reproduction number then overestimates viral acceleration compared to the acceleration index because, while new cases are still constant, current tests are above their historical average. In that case, the acceleration index is smaller than one and signals deceleration of viral spread, despite cases being constant over time. Or the test intensity is smaller than one (i.e. $B_t < 1$), so that the reproduction number underestimates viral spread acceleration because current tests, while being below their historical average, still detect the same number of cases. The acceleration index is then larger than one, indicating indeed acceleration of viral spread. Such benchmark cases shed light on the reason why the reproduction number needs to be adjusted to take into account tests when they are time-varying.
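The constant-cases benchmark can be made explicit with a short worked step, offered here as an illustrative derivation rather than a result stated in the original text. Writing $p$ for the assumed constant daily number of cases, so that cumulated cases are $t\,p$ at date $t$, the acceleration index reduces to the inverse of the test intensity:

```latex
\varepsilon_t
  = \frac{p_t / d_t}{P_t / D_t}
  = \frac{p / d_t}{t\,p / D_t}
  = \frac{\tfrac{1}{t} D_t}{d_t}
  = \frac{1}{B_t}
```

Hence a test intensity above one gives an acceleration index below one, and conversely, while the reproduction number of Eq (4) stays equal to one throughout.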
A schematic example to help visualize the latter case is presented in Fig 1, assuming constant population size for simplicity. Suppose that the numbers of both tests and positive cases have been constant prior to date $t$ and contrast the alternative outcomes at $t + 1$ (scenarios 1 and 2). At each date, a square represents total population and each dot represents one individual. Red areas include individuals who have been tested, among which green (respectively red) dots represent positive (respectively negative) individuals, while the complementary grey areas include untested individuals with unknown health status. From the situation at date $t$ depicted in the left panel, two mutually exclusive scenarios originate at $t + 1$, depending on whether the number of tests goes down (upper right, scenario 1) or up (bottom right, scenario 2), while the number of positive cases stays constant across scenarios, between $t$ and $t + 1$. Note that using only the number of positive cases (hence ignoring the number of tests) leads to the conclusion that the epidemic situation has not changed since the (two-period) reproduction number $\hat{R}$ equals one at both dates. In other words, the epidemic situation neither worsens nor improves. However, comparing scenarios 1 and 2 in Fig 1 clearly reveals that the two alternative dynamics differ sharply: in the upper right panel, the number of tests goes down between $t$ and $t + 1$, so that an equal number of cases is detected with a smaller number of tests, indicating an accelerating epidemic; in contrast, the lower right panel depicts a situation in which the number of tests increases markedly while an equal number of cases still materializes, indicating now a decelerating epidemic.
In addition, it is perhaps instructive to go through the logic that delivers the magnitudes for the reproduction number $\hat{R}$ and acceleration index $\varepsilon$ in Fig 1. Both equal one at date $t$, assuming again for simplicity an identical situation prior to that. Since cases do not change over time, $\hat{R}$ does not either and stays equal to one at both dates. In addition, it follows that half of total cases (cumulated over the two periods) is detected at $t + 1$ in both scenarios. However, the contribution to cumulated tests at $t + 1$ is not the same in both cases, compared to $t$. In scenario 1, tests are halved so that the contribution to cumulated tests at $t + 1$ is only 1/3, that is, $(1/2)/(1 + 1/2)$. As a consequence, $\varepsilon_{t+1} = (1/2)/(1/3) = 1.5$ in scenario 1: since as much as half of total cases cumulated over the two periods is detected using only a third of cumulated tests, the epidemic is accelerating in period $t + 1$, as signalled by the property that $\varepsilon_{t+1} > \hat{R}_{t+1} = 1$ then. In contrast, the number of tests doubles at $t + 1$ in scenario 2, so that the contribution to cumulated tests is now 2/3, that is, $2/(1 + 2)$. It follows that $\varepsilon_{t+1} = (1/2)/(2/3) = 0.75$: the epidemic is decelerating since only half of cumulated cases is detected using as much as two thirds of cumulated tests, implying that $\hat{R}_{t+1} = 1 > \varepsilon_{t+1}$. This example further illustrates why taking into account time-varying tests makes for a more accurate measure of epidemic acceleration. Note also that while the simple example in Fig 1 may give the impression that the positivity rate suffices to capture the epidemic dynamics, it is worth reiterating that it is a measure of speed which is not unit-free, as opposed to the acceleration index, which is indeed a unit-free measure of the extent to which viral spread accelerates. Finally, incidence rates are not really useful to capture viral dynamics when population is constant, as in the example, or, more realistically, changing slowly.
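The two scenarios can be reproduced in a few lines. The sketch below is only an illustration: it normalizes cases and tests at date t to one, and it assumes that the two-period reproduction number puts all its weight on the previous date, so that it is simply the ratio of consecutive case counts. The function name is hypothetical.

```python
def two_period_indicators(cases, tests):
    """Return (epsilon, r_hat) at t + 1 from series [value at t, value at t + 1]."""
    # Acceleration index: share of cumulated cases found at t + 1 divided by
    # the share of cumulated tests performed at t + 1.
    eps = (cases[1] / tests[1]) / (sum(cases) / sum(tests))
    # Assumed two-period reproduction number: cases at t + 1 over cases at t.
    r_hat = cases[1] / cases[0]
    return eps, r_hat


# Normalized units: one unit of cases and one unit of tests at date t.
scenario_1 = two_period_indicators(cases=[1.0, 1.0], tests=[1.0, 0.5])  # tests halved
scenario_2 = two_period_indicators(cases=[1.0, 1.0], tests=[1.0, 2.0])  # tests doubled
```

Under these assumptions the first scenario returns an acceleration index of about 1.5 and the second about 0.75, while the reproduction number equals one in both.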
Two general observations follow from the above description of benchmark cases. First, when test intensity is larger (respectively smaller) than infectivity intensity, the reproduction number tends to overestimate (respectively underestimate) viral acceleration compared to the acceleration index. This implies that the reproduction number must be test-adjusted if it is to serve as an accurate enough indicator of viral spread that guides public health policies. Second, following the logic of the first hypothetical case outlined above, one might also envision versions of the test-adjusted reproduction number that would divide the expression in (4) by a similarly defined growth rate of tests, over a rolling window. For instance, a short-term test-adjusted reproduction number could be alternatively defined as:

$$\tilde{R}_t = \frac{p_t \big/ \sum_{j=0}^{n} w_j\, p_{t-j}}{d_t \big/ \sum_{j=0}^{m} z_j\, d_{t-j}} \tag{7}$$

given the lag parameter $m$ and $\sum_{j=0}^{m} z_j = 1$. The expression in (7) is a simple test-adjusted version of (4). It would be interesting to investigate the properties of such a test-adjusted reproduction number, defined over rolling windows, and compare it to the acceleration index. As underlined above, keeping all the past of the pandemic is important, for instance to compare groups and derive a unit-free measure of vaccine effectiveness (see [7,18]), and possibly to capture path dependence. Although not addressed in this paper, whether the new indicator that is defined by Eq (7) may turn out to be useful for other purposes is an open question. Finally, although this is beyond the scope of this paper, which abstracts from specific models of epidemics, we would like to stress some unreported results from simulation exercises. In a SIR model augmented to include time-varying tests, and in which only tested individuals among infected ones are observed, we have performed simulations indicating when the acceleration index captures more accurately the epidemic peak and deceleration than the reproduction number with imperfect information (and not test-adjusted).
Interestingly, this happens in particular when tests become progressively available in a way that might lag the unobserved epidemic peak. Although of course model- and parameter-specific, such simulation results go in the direction of the model-free results in this paper that test-adjusted versions of the reproduction number, such as the acceleration index that we advocate, better track the dynamics of viral spread.
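Returning to the short-term test-adjusted reproduction number of Eq (7), a minimal Python sketch could divide the rolling ratio for cases by an analogously defined rolling ratio for tests. The function names, the weights and the toy series are hypothetical, and the properties of such an indicator are, as noted above, left open in this paper.

```python
def rolling_ratio(series, weights):
    """Current value over a weighted moving average of current and past values."""
    n = len(weights) - 1
    out = []
    for t in range(len(series)):
        if t < n:
            out.append(float("nan"))  # not enough history yet
            continue
        base = sum(w * series[t - j] for j, w in enumerate(weights))
        out.append(series[t] / base if base > 0 else float("nan"))
    return out


def adjusted_reproduction_number(cases, tests, w, z):
    """Rolling ratio of cases divided by the rolling ratio of tests.

    w and z are assumed weight vectors, each summing to one, for cases
    and tests respectively; they may have different lengths.
    """
    r_cases = rolling_ratio(cases, w)
    r_tests = rolling_ratio(tests, z)
    nan = float("nan")
    return [rc / rt if rt else nan for rc, rt in zip(r_cases, r_tests)]


# Hypothetical example: cases and tests both double each period, so the
# unadjusted ratio is 2 while the test-adjusted one is 1.
both_double = adjusted_reproduction_number(
    [1, 2, 4, 8], [10, 20, 40, 80], w=[0.0, 1.0], z=[0.0, 1.0]
)
flat_tests = adjusted_reproduction_number(
    [1, 2, 4, 8], [10, 10, 10, 10], w=[0.0, 1.0], z=[0.0, 1.0]
)
```

With constant tests the adjustment is neutral and the indicator coincides with the unadjusted ratio for cases.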
To go back and better grasp the relation between the more general reproduction number $R$ (not to be confused with $\hat{R}$) and $\varepsilon$ as indicated in Eq (5), a more theoretical analysis and its implications may be helpful. First of all, as it also becomes clear from Eq (5), when $A = B$ (that is, when Eq (6) holds), then $\varepsilon = R$. This basically means that if the test intensity $B$ tracks the dynamics of the infectivity intensity $A$, there is enough testing to capture viral activity. In fact, seen from this perspective, we have a clear testing strategy: the daily tests $d_t$ need to offset the assumed infectivity captured by function $f_t$, and more specifically Eq (6). The smaller the total number of cases, the easier it will be to match that testing requirement, in particular through contact tracing. As total cases go up, contact tracing and sufficient testing may reach their structural and systemic limits. This in itself is a sign that additional health policies will need to be promoted.
In sum, if A > B, then ε > R, whilst when A < B, ε < R. In the former case, the infectivity intensity A of the pandemic cannot be matched by the test intensity B in place. That is, R does not give the appropriate picture of the infectiousness of the pandemic; in fact, it underestimates it. To alleviate this bias, either testing would need to be increased, or viral spread would need to be cut by establishing policies that reduce contacts, or a mixture of both. In any case, it shows that Eq (3), which defines R, does depend on more than past and current cases, because they themselves depend on tests and other factors that may or may not favour transmissibility. Conversely, in the latter case, R will overestimate the speed of the pandemic if the test contribution B is greater than the infectivity contribution A. In such a situation, testing in excess of the underlying infectivity will necessarily find more cases, actually too many to reflect the correct transmissibility. To capture the correct picture, either testing would need to be reduced, which however seems counterproductive at least to the extent that testing is a way to look at the underlying viral dynamics, or the infectivity function f t of Eq (3) needs to be adapted to reflect reduced transmissibility.
In the next section, we apply the theoretical decomposition outlined above to capture how the reproduction number and acceleration index differ in the context of the current COVID-19 epidemic in France.

Results and discussion
Before we turn to the application of the above analysis to French data, it may help to dissect a simple example showing in more detail how and why the reproduction number and the acceleration index might differ. A thorough analysis of Eq (5) requires structural assumptions, in particular to generate predictions about how ε, R and their ratio move over time. The simple case of deterministic exponential growth, following equation (12) in [2], comes in handy here. Time is assumed to be continuous, to ease the derivation of results, and the number of cases grows exponentially over time, as usually assumed in epidemiological models of the SIR type and related ones. In such a case, the continuous-time analog of the reproduction number R defined in Eq (4) (not the more general R now) is constant, and any difference between it and ε is due to differences between A and B. For ε, we also have to introduce tests, and we assume that they also grow exponentially.
Under those assumptions, we show in S1 Appendix that while the reproduction number is constant over time, the acceleration index is not: it features different regimes depending on how the growth rate of daily cases compares with the growth rate of daily tests. For example, when the former is larger than the latter, the acceleration index first rises and then approaches a plateau, where it equals the ratio of growth rates, which is larger than 1 in that case. In contrast, the reproduction number stays constant over time. This is easiest to visualize in the simple setting of exponential growth (see S1 Appendix), but it also holds more generally that the difference between both indicators is essentially due to the fact that, while the acceleration index is the ratio of two growth rates, that of cases divided by that of tests, the reproduction number tracks only the former and ignores the latter. It is also for this reason that we say that the acceleration index ε nests the basic reproduction number R, which is simply a special case of ε. Different configurations may therefore occur over time, depending on how fast cases grow compared to tests.
To further illustrate the case of exponential growth outlined above and studied in more detail in S1 Appendix, we now provide an example in which the growth rate of daily cases is twice as large as the growth rate of daily tests. Fig 2 shows how the acceleration index and the reproduction number, as well as the infectivity and test intensities, evolve over time in this particular example. In panel (a) of Fig 2, we report the evolution of the time-varying acceleration index ε(t) and of the constant reproduction number R that follow from the numerical example. In S1 Appendix, we show that while R is constant, ε tends to the ratio of growth rates, which is equal to 2 in the example. As a consequence, a first regime in which the reproduction number exceeds the acceleration index is followed by a second regime that features the reverse configuration. Not surprisingly, panel (b) of Fig 2 shows that the first regime materializes before day 9, when B(t) > A(t), that is, when the test intensity exceeds the infectivity intensity, while the second regime is associated with A(t) > B(t) after day 9. Panel (b) reveals in particular that the plateau of the acceleration index featured in panel (a) comes from the fact that both A(t) and B(t) grow at the same rate, with the infectivity intensity exceeding the test intensity. Overall, therefore, the acceleration index tracks the ratio of growth rates, which equals 2 in our example, while the reproduction number underestimates that ratio because it roughly reflects only its numerator.
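The convergence of the acceleration index to the ratio of growth rates can be checked numerically. The following minimal sketch treats the acceleration index as the growth rate of cumulated cases divided by the growth rate of cumulated tests, with daily cases and daily tests growing exponentially at rates g and h. The zero initial stocks, the function name and the values of g and h are assumptions made for illustration; they are not the calibration behind Fig 2, so the sketch does not reproduce the day-9 crossing.

```python
import math


def acceleration_index_exponential(t: float, g: float, h: float) -> float:
    """Acceleration index at time t > 0 under exponential growth.

    ASSUMPTIONS: daily cases ~ exp(g*t), daily tests ~ exp(h*t), and both
    cumulated cases and cumulated tests equal zero at t = 0.
    """
    if t <= 0:
        raise ValueError("t must be strictly positive")
    case_growth = g / (1.0 - math.exp(-g * t))  # daily cases / cumulated cases
    test_growth = h / (1.0 - math.exp(-h * t))  # daily tests / cumulated tests
    return case_growth / test_growth


# Hypothetical rates: daily cases grow twice as fast as daily tests.
g, h = 0.2, 0.1
for day in (1, 5, 10, 20, 40, 80):
    print(day, round(acceleration_index_exponential(day, g, h), 3))
```

Under these particular assumptions and with g = 2h, the expression simplifies to 2 / (1 + exp(-h t)), which rises monotonically from 1 towards the plateau of 2 described in the text.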
Interestingly, we can derive from panel (b) of Fig 2 an operational tool to track, and possibly to control, in real time the difference between ε and R. Fig 2 illustrates an acceleration phase of the epidemic, where positive cases grow faster than tests. As a consequence, the test intensity B(t) eventually lags behind the infectivity intensity A(t). However, the implication that R underestimates viral acceleration after day 9, compared to ε, is not inescapable. In fact, one could aim at controlling the ε/R ratio: as data are updated in real time, whenever the infectivity intensity is observed to exceed the test intensity (that is, A > B), one should increase the latter by making sure that daily tests accelerate. Ideally, the test intensity should track the infectivity intensity as closely as possible, in order to optimize the accuracy of R as an indicator of viral acceleration. Note that this goes in the same direction as the effort by public authorities to control viral spread by testing more and isolating detected positive cases as much as possible. In that sense, ensuring that testing accelerates in the run-up to an epidemic peak has two benefits: improving the epidemic situation, in so far as testing more contributes to the control of viral spread, and increasing the accuracy of the indicator tracking viral acceleration. Admittedly, tracking ε in real time is a more parsimonious way to attain accuracy, compared to tracking R as well as A and B (or their ratio). Our decomposition shows formally that both approaches are equivalent, though, and this result does not rely on exponential growth but holds more generally.
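The control rule described above can be given a back-of-the-envelope form. The sketch below assumes that the test intensity B is the share of the current day's tests in cumulated tests, with the current day included in the cumulated total. This definition, the function name and the numbers are assumptions made for illustration. Setting B = A and solving for daily tests then gives the volume of tests that would make the test intensity match a given infectivity intensity.

```python
def daily_tests_to_match(A: float, cumulated_tests_before: float) -> float:
    """Daily tests d such that d / (cumulated_tests_before + d) equals A.

    ASSUMPTION: B = daily tests / cumulated tests (current day included).
    Solving B = A for d gives d = A * T_prev / (1 - A).
    """
    if not 0.0 <= A < 1.0:
        raise ValueError("A must lie in [0, 1)")
    if cumulated_tests_before < 0:
        raise ValueError("cumulated tests cannot be negative")
    return A * cumulated_tests_before / (1.0 - A)


# Hypothetical situation: one million tests so far, infectivity intensity of 2%.
tests_needed = daily_tests_to_match(A=0.02, cumulated_tests_before=1_000_000)
implied_B = tests_needed / (1_000_000 + tests_needed)
print(round(tests_needed), round(implied_B, 4))
```

Because cumulated tests keep growing, the same infectivity intensity requires ever more daily tests, which illustrates why testing may reach its limits as an epidemic accelerates.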
The conclusion that the reproduction number R alone may poorly capture the dynamics of viral spread when tests vary over time, as expressed formally in Eq (5), is not a mere theoretical curiosity. It strongly suggests possible pitfalls associated with its exclusive use in guiding and evaluating public health policies such as Non-Pharmaceutical Interventions (NPI hereafter) in practice. Either ε, or R together with the ratio of infectivity and test intensities A/B, should be used to track viral acceleration. A casual reading of the literature reveals, however, that the reproduction number alone is widely used as the success metric to assess the effects of NPI (see e.g. [19] among many others). It should be clear by now that the conclusions thus derived should be considered with caution, at best, whenever tests are not accounted for. For example, [20] state that "In Fig 4c, 'enhancing testing capacity' and 'surveillance' exhibit a negative impact (that is, an increase) on R_t, presumably related to the fact that more testing allows for more cases to be identified." Although increasing testing might indeed lead to an increasing reproduction number, it does not follow that such an NPI has an adverse effect on the epidemic, especially if tests are rising as much as during the March-April 2020 period considered by these authors in their study. Again, we cannot stress enough that whenever data about how many tests are performed are readily available (as in [20] but also in many other related studies), they should be used to measure the dynamics of viral spread as accurately as possible and to adjust the reproduction number using property (6). Obviously, testing does not realistically capture all infected individuals unless the whole population is tested each and every day, but this observation should push policies towards testing as much as possible, not towards ignoring data about tests altogether.
In addition, our analysis below clearly shows that complementing the reproduction number with the positivity rate (that is, the ratio of positive cases to tests) is not a satisfactory answer either, as it might deliver conflicting evidence, such as the former falling to 1 while the latter shoots up. Fig 3 summarizes our main empirical results obtained from data for France over the period that runs from May 13, 2020, to October 26, 2022. Note that data for France are available only for the period following the end of the first and longest lock-down, which extended from about mid-March to mid-May 2020, and unfortunately not from the onset of the pandemic. More details about the input data and output variables used in the analysis of this section are gathered in Table 1 of S1 Appendix. In panel (a), we report both the acceleration index (blue curve) and the reproduction number (green curve) over time. The grey area represents the period before vaccination against COVID-19. Yellow areas depict the second and third lock-down periods. The yellow area ends more or less when the pink area starts, which is when the first vaccination campaign begins, around the end of year 2020 (see Table 2 of S1 Appendix for precise dates). The acceleration index ε is computed using Eq (1), while the reproduction number R is computed from Eq (4) with n = 7 and equal weights w. Note that the infection kernel could be adapted to account for sub-exponential growth as in [21]. The downward spikes of the reproduction number R are due to the much lower amount of testing during week-ends. This can clearly be seen in panel (d) of Fig 3, which depicts the number of daily tests (in pink). In all panels except (c), we plot the raw variables rather than any smoothed estimates, in order to avoid any additional layer of interpretation.
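To make the computation of the two raw indicators concrete, here is a minimal sketch on synthetic data. Eq (1) and Eq (4) are not reproduced in this section, so their forms are assumed: the acceleration index is taken to be the daily positivity rate divided by the cumulated positivity rate, and the reproduction number is taken to be daily cases divided by the equally weighted average of the previous n = 7 days of cases. Function names and the synthetic series are hypothetical.

```python
def acceleration_index(cases, tests):
    """ASSUMED form of Eq (1): (cases_t / tests_t) / (cum_cases_t / cum_tests_t)."""
    eps, cum_cases, cum_tests = [], 0.0, 0.0
    for c, d in zip(cases, tests):
        cum_cases += c
        cum_tests += d
        if d > 0 and cum_cases > 0:
            eps.append((c / d) / (cum_cases / cum_tests))
        else:
            eps.append(None)  # undefined without tests or before the first case
    return eps


def reproduction_number(cases, n=7):
    """ASSUMED form of Eq (4): cases_t / mean(cases_{t-n}, ..., cases_{t-1})."""
    R = []
    for t, c in enumerate(cases):
        if t < n:
            R.append(None)  # not enough history yet
            continue
        past_mean = sum(cases[t - n:t]) / n  # equal weights w = 1/n
        R.append(c / past_mean if past_mean > 0 else None)
    return R


# Synthetic series: daily cases grow 5% a day, daily tests grow 2% a day.
cases = [100 * 1.05 ** t for t in range(60)]
tests = [5000 * 1.02 ** t for t in range(60)]
eps = acceleration_index(cases, tests)
R = reproduction_number(cases)
print(round(eps[-1], 2), round(R[-1], 2))
```

In this synthetic example the reproduction number is constant once seven days of history are available, while the acceleration index keeps moving because it also reacts to the growth of tests.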
In panel (c) of Fig 3, we present local polynomial regressions for the R and ε of panel (a), which use the Savitzky-Golay filter, a local polynomial smoothing method closely related to locally estimated scatter-plot smoothing in modern statistics (see [22]). The blue line is again our acceleration index. In red, we depict the reproduction number published by Santé Publique France, whereas the green line represents our own estimate of the reproduction number. As can be noted, the reproduction number estimated by Santé Publique France does not fall far outside the confidence bands of our own estimate, which are the dotted green lines. Most importantly, both estimates of the reproduction number cross 1 at about the same dates. Even though Santé Publique France refers to the "Cori method", we have not found public information about the precise weights attached to past values of the number of cases in computing infectivity. In addition, as can be seen from the confidence bands, the acceleration index is estimated more precisely than the reproduction number, because it takes account of variations of tests, and thus of cases, due to the week-end effect. In effect, being a ratio of growth rates (that of cases over that of tests), the acceleration index turns out to be smoother than the reproduction number (which is closer to the growth rate of cases and thus drops sharply over week-ends). Let us now center the discussion around panel (c) of Fig 3 and focus first on the period preceding vaccination, which is represented by the grey area. We see that right after the end of the first lock-down, both indicators hover below 1, with R being slightly above ε. We concentrate here on the green line, i.e. our estimate of R, since the estimate by Santé Publique France has no longer been included in the public data set since August 12, 2022. Our estimated R then rises quickly to a higher plateau in the first half of July 2020, indicating greater transmissibility, and stays at a level of about 1.2 until mid-August 2020. At the same time, ε first stays put at a level smaller than 1 and becomes greater than 1 a few days later, effectively crossing R at the beginning of August and accelerating all along until about mid-August 2020.
The difference in dynamics between both indicators for France can easily be explained by looking at panel (b) of Fig 3. Here, we report the two terms that appear in Eq (5), that is, A, the infectivity intensity (orange curve), and B, the test intensity (red curve). The latter curve exhibits spikes, again due to the fact that far fewer tests, if any, are performed during week-ends. The test intensity follows a downward trend that simply reflects the fact that the tests performed in a given period constitute, over time, a smaller and smaller fraction of the cumulated amount of diagnostics. What we see in panel (b) in particular is that before July 29, the test intensity B is greater than the infectivity intensity A, which at first also follows a downward trend. A greater testing rate implies that more cases will be found. This corresponds to the period when R is greater than ε. But R overestimates viral activity because it does not consider tests and focuses on cases only, while ε accounts for tests because it is built on the ratio of the infectivity and test contributions. The opposite is true for the period after July 29, when the infectivity intensity A becomes greater than the test intensity B, hence ε is greater than R. Despite growing daily tests until the end of August, as we can see in panel (d) of Fig 3, R first remains at a plateau but then steadily declines until the second half of September.
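The switch from B > A to A > B can be located mechanically once both intensities are computed. The sketch below assumes that the infectivity intensity A is the equally weighted average of the previous n = 7 days of cases divided by cumulated cases, and that the test intensity B is daily tests divided by cumulated tests. These definitions are inferred from the description of the two curves and are assumptions, as are the function names and the synthetic series, which do not reproduce the French data or the July 29 crossing.

```python
def intensities(cases, tests, n=7):
    """Return a list of (day, A, B) from day n onwards.

    ASSUMPTIONS: A = mean(previous n days of cases) / cumulated cases,
                 B = daily tests / cumulated tests.
    """
    out = []
    cum_cases = cum_tests = 0.0
    for t, (c, d) in enumerate(zip(cases, tests)):
        cum_cases += c
        cum_tests += d
        if t >= n and cum_cases > 0 and cum_tests > 0:
            infectivity = sum(cases[t - n:t]) / n  # equally weighted past cases
            out.append((t, infectivity / cum_cases, d / cum_tests))
    return out


def first_day_infectivity_exceeds_tests(cases, tests, n=7):
    """First day on which A > B, or None if that never happens."""
    for t, A, B in intensities(cases, tests, n):
        if A > B:
            return t
    return None


# Synthetic series: daily cases grow 5% a day, daily tests grow 2% a day.
cases = [100 * 1.05 ** t for t in range(60)]
tests = [5000 * 1.02 ** t for t in range(60)]
day = first_day_infectivity_exceeds_tests(cases, tests)
print("A exceeds B from day", day)
```

Before the reported day the test intensity dominates, and from that day on the infectivity intensity takes over, which mirrors the two regimes visible in panel (b).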
Comparing ε with R over the summer of 2020 is even more striking. Soon after the acceleration index rises above the reproduction number, around August 5, 2020, both measures start to diverge and hence deliver opposite messages regarding the evolution of the pandemic. While R decreases from about 1.5 on August 15 to about 1 on September 22, ε goes up from approximately 1.7 on August 15 to 1.9 on August 25, and plateaus at the latter value until September 22. In other words, not accounting for time-varying tests over that period gives the impression that the pandemic situation improves, while on the contrary it is shown to worsen once we compare, as we should, the dynamics of the infectivity and test intensities. As shown in panel (b) of Fig 3, the period during which R goes down to reach 1 is also a period when the infectivity intensity A either rises faster or declines more slowly than the test intensity B: as a consequence, the acceleration index first rises and then plateaus around 1.9, indicating again a worsening of the pandemic. In effect, this means that 1% more cumulated tests delivers roughly 2% more cumulated cases around September 22. This is a worsening of the pandemic indeed, which continues until the second lock-down, depicted by the first and leftmost yellow area, with ε reaching about 2.5. While R also rises again from 1 after September 22, it remains true that looking separately at the reproduction number and at the positivity rate delivered conflicting messages about the pandemic resurgence, with the former indicating an improvement and the latter showing a worsening. In contrast, the acceleration index consistently indicated that the pandemic was still in an acceleration regime that at best stabilized before worsening again.
This clearly shows that R has been unable to represent viral acceleration during both summer months because, as we can see from panel (b), the infectivity intensity is rising more quickly than the test intensity. Therefore, R overlooks the testing dimension and only captures the number of cases, but those are undercounted given a lower test intensity B. Even worse, testing then declines at the end of August while the infectivity function f, indicated in black in panel (d), starts going up. This affects R, which declines to reach a level of about 1 by the end of September. This is in very stark contrast to ε, which accelerates from early July onwards and then hovers at a plateau of about 2 up to the end of September, as it appropriately takes into account the relationship between the changing growth rates of testing and infectivity.
Both indicators go up again from the end of September onwards as testing rises again. But while R reaches a plateau again as the first curfew measures were put into place to cut transmission, ε indicates further acceleration. Both indicators then start declining when, at the end of October, the second lock-down was put into place. However, R < 1 from the beginning of November, whilst at the same time our indicator ε still shows an ongoing acceleration, although a reduced one compared with the time before the lock-down. What we see very clearly from panel (d) is that the lock-down coincides with a great reduction of testing. Obviously, a lock-down is aimed at reducing contacts and thus viral spread. This will necessarily reduce cases and hence R declines. But if testing, which is the only way to get a clearer picture of viral activity, is reduced at the same time, this necessarily affects R more dramatically, which explains why R under-evaluates viral spread while ε continues to indicate acceleration. More specifically, ε indicates what happens to cases when testing is reduced by some percentage. The fact that the percentage change of cases goes in the same direction as the percentage change of testing, i.e. that both decrease, is a good sign and indicates that lock-down measures have an effect. But ε does not yet allow one to give the all-clear, as R does. As a consequence, ε captures the considerable time variation of virus propagation more accurately than R.
Similar observations about the discrepancies between both indicators can be derived from the period that follows the first vaccination campaign in France, around the turn of year 2020, which is depicted by the pink area in panel (c) of Fig 3. In particular, the peak due to the Delta strain shows up as a rising acceleration index, with a local peak around August 4, 2021, which however stays in the deceleration regime with values much below 1. In contrast, both the estimate of Santé Publique France and our own estimate of the reproduction number take values much larger than 1 during the whole month of July 2021. Even more revealing is the period starting mid-November 2021, which saw the Omicron strain become progressively dominant. While the reproduction number computed by Santé Publique France goes down from about 1.6 on November 16 to about 1.1 on December 11, 2021, thus indicating an improvement, the acceleration index goes up from around 1 to 1.5 between those two dates, showing in contrast a return of the pandemic to the acceleration regime. Even more strikingly, the Omicron dominance period, which begins at the end of December 2021, is associated with a continuous, albeit declining, worsening phase according to the acceleration index. In sharp contrast, the reproduction number shows alternating periods of exploding and dampening transmissibility, hovering around 1. To sum up, panel (c) of Fig 3 reveals that the year 2022 is associated with a significant bias of the reproduction number due to not accounting for time-varying tests (the magnitude of which can be seen in panel (d)). The origin of such a bias, we argue, is best understood using the decomposition between the test and infectivity intensities: as shown in panel (b) of Fig 3, over the year 2022 the infectivity intensity consistently wins the horse race against the test intensity, as the former stays larger than the latter. As a consequence, the reproduction number significantly underestimates the acceleration of viral spread, which turns out not to have been reduced by vaccines if one compares the grey and pink areas.
To make clear that the discrepancy between the reproduction number and the acceleration index shown in Fig 3 is not specific to France, we report in S1 Appendix a similar decomposition for five other countries over the pre-vaccination period. Such a comparison reveals that the reproduction number might either overestimate or underestimate viral acceleration, depending on the country and the time period. To sum up, two main differences between the reproduction number and the acceleration index appear in panel (c). The first is that R crosses unity earlier than the acceleration index, which starts to increase around July 6. This is most likely due to the infectivity rate reaching its lowest point around that date (as seen in panel (b)). After that date, the infectivity rate begins to increase, while the test rate continues downwards. This explains why the acceleration index could not start growing before July 6. Overall, therefore, R is larger than ε before August 5. The second difference is a sudden decrease of the reproduction number in the second half of September, while the acceleration index stays at a plateau. This can be explained by panel (d), in which we see a sharp drop in the number of daily tests around that period. Fewer tests mean fewer detected cases, on which R relies heavily for its calculation. Seeing R rise sharply from about 1 in early October is all the more surprising when R is considered in isolation. In contrast, the acceleration index, which accounts for variations of both cases and tests, consistently shows a succession of periods of steep rise followed by plateaus over the summer and until the second lock-down.
In practice, many public health agencies report (daily or weekly) positivity rates, to complement the information contained in the reproduction number R. In light of the connection with ε that we have highlighted in this paper, both formally and empirically, we argue that the acceleration index is closer to a sufficient statistic that helps track the rapidly changing dynamics of any pandemic, because it explicitly takes into account the dynamics of diagnostics. All in all, the acceleration index is a test-adjusted reproduction number. In the context of COVID-19, diagnostics equal tests, but our claim is valid more generally when this is not the case. This means that the acceleration index can potentially be applied to any effort designed to detect infected people, no matter what pathogen triggers the infectious disease. In real time, this is quite valuable, we believe, to guide health policies and to assess containment measures, especially when a new pathogen (such as SARS-CoV-2) appears, with unforeseeable pandemic dynamics.

Conclusion
We show in this paper that the reproduction number is a special case of the acceleration index proposed in [6]. While the former only considers the growth rate of cases, the latter measures variations of cases in relation to that of tests, and it does so in a unit-free manner since it is an elasticity. Most importantly, the acceleration index is a sufficient statistic of viral spread in the sense that it aggregates all the relevant information in a synthetic manner. In contrast, looking at pieces of information, like positivity or prevalence rates, separately might lead to the misleading conclusion that there is conflicting evidence about whether the epidemic worsens or improves. As such, a test-controlled reproduction number like the acceleration index should be part of any data dashboard to track an epidemic, and especially to guide public policy in the design of the most efficient methods to curb it. For example, we have shown in [6] that an accurate measure of virus circulation is a key input to feed algorithms that are designed to efficiently allocate the diagnostic effort across space.
The result that the reproduction number is, as a measure of viral spread, subject to a considerable bias is not specific either to France or to the period considered, as illustrated using data from five other countries. Such a comparison reveals that the reproduction number might either overestimate or underestimate viral acceleration, depending on the country and the time period, due to not correcting for the time-varying amount of testing. This readily suggests that the issue might be highly relevant in many other countries as well, where public health authorities are also in dire need of accurate indicators to track epidemics. As such, this observation should also concern international bodies that design cooperation strategies to fight pandemics, including of course the World Health Organization and other regional agencies.
Relatedly, a key conclusion follows from our theoretical and empirical results. If public health authorities aim at measuring as accurately as possible viral acceleration, they have to rely on one of the following strategies: track in real time either the acceleration index alone, or a combination of the reproduction number together with the test and infectivity intensities. Although both strategies are formally equivalent, the latter is not only less parsimonious, it is also arguably more delicate to operate in practice since one would then like to control the bias that inevitably comes from time-varying tests. This is one of the main reasons why we argue in favor of using the acceleration index.
Such observations make the acceleration index a more parsimonious indicator to track a pandemic in real-time, as it is context-dependent: the acceleration index takes into account the effort to diagnose people who have been infected by the pathogen. In the case of COVID-19, diagnostics equal PCR (and other types of biological) tests, but this might not be the case for other diseases where diagnostics require even greater effort. However, our analysis makes a strong case for incorporating in any measure of pathogen circulation the observed effort to diagnose the agent that makes people sick.
Even though there is a variety of infectious (and emerging) diseases, with different pathogens and various ways to diagnose them, we claim that our conceptual approach is general enough to shed light, not only on the current pandemic, but also on any future ones which may come. In addition, our analysis extends to alternative methods to estimate the effective reproduction number, beyond the specific example stressed in this paper for the sake of presentation.
Finally, we would like to mention some limitations of our analysis. Some important issues, beyond the scope of our paper, have however been addressed by the literature. The first is that unaccounted cases arise when testing is not compulsory (see [23] for France). Second, the coexistence of symptomatic and asymptomatic cases during COVID-19 has led to additional statistical methods (see [24]). Another limitation of our acceleration index is that its accuracy depends positively on the amount of tests performed. For instance, detecting infected people who are asymptomatic but still potentially infectious requires an active policy and enough testing capacity. Even though more tests make the acceleration index more accurate, the analysis should also take into account that testing policy requires costly resources. From an economic standpoint, future research should also examine the cost-efficiency of policies aimed at mitigating epidemics, including active testing.